EXECUTIVE SUMMARY


In  order  to  assess  the  potential  impact  of  the  accidental  introduction  of  an  organic  chemical 
into  the  environment,  information  is  needed  concerning  its  environmental  fate.  The  fate  of  an 
organic  chemical  in  the  environment  depends  on  a  variety  of  physical,  chemical  and  biological 
processes.  Mathematical  models,  which  attempt  to  integrate  these  processes,  are  widely  used  to 
predict  the  transport  and  distribution  of  organic  contaminants  in  the  environment.  Use  of  these 
models  requires  a  variety  of  input  parameters  which  describe  site  and  contaminant  physical- 
chemical  and  biological  characteristics.  Several  important  contaminant  properties  used  to  assess 
the  mobility  and  persistence  of  a  chemical  are  aqueous  solubility,  octanol/water  partition 
coefficient,  soil/water  sorption  coefficient,  Henry's  Law  constant,  bioconcentration  factor,  and 
transformation  rates  for  biodegradation,  photolysis  and  hydrolysis. 

One  major  limitation  to  the  use  of  environmental  fate  models  has  been  the  lack  of  suitable 
values  for  many  of  these  properties.  The  scarcity  of  data,  due  mainly  to  the  difficulty  and  cost 
involved  in  experimental  determination  of  such  properties,  has  resulted  in  an  increased  reliance  on 
the  use  of  estimated  values  for  many  applications. 

Quantitative Structure-Property Relationships (QSPRs) and Quantitative Property-Property
Relationships  (QPPRs)  are  methods  by  which  properties  of  a  chemical  can  be  estimated  from  a 
knowledge  of  the  structure  of  a  molecule  or  from  another  more  easily  obtained  property.  Selection 
and  application  of  the  most  appropriate  QSPRs  or  QPPRs  for  a  given  compound  is  based  on 
several  factors  including:  the  availability  of  required  input,  the  methodology  for  calculating  the 
necessary  topological  information,  the  appropriateness  of  a  correlation  to  the  chemical  of  interest, 
and  an  understanding  of  the  mechanisms  controlling  the  property  being  estimated. 

A microcomputer program, the Property Estimation Program (PEP), utilizing molecular
connectivity indices (MCI)-property, total molecular surface area (TSA)-property and
property-property correlations and UNIFAC derived activity coefficients, was developed to
provide a fast, economical method to estimate aqueous
solubility, octanol/water partition coefficients, vapor pressures, organic carbon normalized soil
sorption  coefficients,  bioconcentration  factors  and  Henry's  Law  constants  for  use  in  environmental 
fate  modeling.  The  structural  information  for  the  MCI  and  UNIFAC  modules  can  be  input  using 
Simplified  Molecular  Input  Line  Entry  System  (SMILES)  notation  or  connection  tables  generated 
from  a  commercially  available  two-dimensional  drawing  program.  The  TSA  module  accepts  3-D 
cartesian  coordinates  entered  manually  or  directly  reads  coordinate  files  generated  by  molecular 
modeling  software.  In  the  MCI,  TSA  and  Property-property  modules,  the  user  can  select  from 
either  "universal"  or  class  specific  regression  models  for  each  property.  To  aid  the  user  in 
choosing  the  most  appropriate  regression  model(s),  the  program  automatically  suggests  the  most 
appropriate  regression  model  based  on  the  structure  of  the  compound.  In  addition,  the  statistics 
and  list  of  compounds  used  in  developing  the  model  can  be  displayed.  For  the  regression  based 
modules,  assessments  of  accuracy  based  on  the  95%  confidence  interval  and  the  estimated 
precision  of  the  experimental  values  are  provided  along  with  the  estimated  property  value. 
Additional correlation models can be easily added to PEP by the user. The database of measured
properties used in the development of the property estimation modules and the Level 1 Fugacity
Model are also linked directly to PEP. The current status and use of the program will be
described.

OBJECTIVES OR STATEMENT OF WORK


The  primary  goal  of  this  project  is  to  develop  a  microcomputer-based  decision  support 
system  utilizing  Quantitative  Structure  Property  Relationships  (QSPRs)  and  Quantitative  Property 
Property  Relationships  (QPPRs)  to  predict  the  physical-chemical  properties  of  an  organic  chemical 
which  are  necessary  to  model  its  environmental  fate.  The  specific  properties  that  are  being 
investigated  include:  aqueous  solubility  (S),  octanol/water  partition  coefficient  (Kow),  vapor 
pressure (Pv), organic carbon normalized soil/water partition coefficient (Koc), Henry's Law
constant  (H),  and  bioconcentration  factor  (BCF). 

In  order  to  achieve  the  primary  goal  of  this  research,  the  following  specific  objectives  are 
being  accomplished: 

1. Compile an accurate database of experimentally determined values of S, Kow, Pv, Koc, H, and
BCF for a wide variety of organic compounds. Include compounds exhibiting a broad range of
physical and chemical properties and expected mobility and persistence.

2. Using the database developed in Objective 1, evaluate and refine existing methods and/or develop
new methods for estimating these contaminant properties using QSPRs and QPPRs.

3. Develop a microcomputer-based decision support system which incorporates the methods
developed in Objective 2, to allow the prediction of environmental fate and transport properties of
an organic contaminant upon inputting its structure. Provide an estimate of the accuracy of the
predicted value using the decision support system.

4.  Test  the  ability  of  the  decision  support  system  developed  in  Objective  3  to  provide  an  accurate 
estimate  of  these  environmental  fate  and  transport  properties.  This  will  be  done  using  a  test  set  of 
chemicals  for  which  accurate  experimental  values  are  available. 

5.  Compare  the  decision  support  system  developed  in  Objective  3  to  other  widely  used  property 
estimation  techniques. 




BACKGROUND AND SIGNIFICANCE


Mathematical  models  are  often  used  to  estimate  the  fate  and  impact  of  organic  chemicals  in 
the  environment.  These  models  often  idealize  the  environment  as  a  system  of  connected 
compartments, i.e. water, soil, sediment, air and biota. The complexity of these models ranges
from  simple  steady  state  models  to  non-steady  state  models  which  include  a  large  number  of 
compartments,  transport  between  compartments  and  degradation  processes. 

Use  of  these  models  requires  a  variety  of  input  parameters  which  describe  site  and 
contaminant  physical-chemical  and  biological  characteristics.  Aqueous  solubility  (S), 
octanol/water  partition  coefficient  (Kow),  the  organic  carbon  normalized  soil/water  sorption 
coefficient  (Koc),  vapor  pressure  (Pv),  Henry's  Law  constant  (H),  and  bioconcentration  factor 
(BCF)  are  considered  key  properties  used  to  assess  the  mobility  and  distribution  of  a  chemical  in 
environmental  systems. 

One  major  limitation  to  the  use  of  environmental  fate  models  has  been  the  lack  of  suitable 
values  for  many  of  these  properties.  The  scarcity  of  data,  due  mainly  to  the  difficulty  and  cost 
involved  in  experimental  determination  of  such  properties  for  an  ever  increasing  number  of 
synthetic  chemicals,  has  resulted  in  an  increased  reliance  on  the  use  of  estimated  values. 
Quantitative  Property-Property  Relationships  (QPPRs)  and  Quantitative  Structure-Property 
Relationships  (QSPRs)  have  been  used  by  environmental  scientists  and  engineers  to  obtain 
estimated  values  for  a  variety  of  physical/chemical  properties  for  use  in  environmental  fate  and 
assessment  modeling. 

Quantitative Property-Property Relationships (QPPRs), based on the relationship between
two properties as determined by regression analysis, are used to predict the property of interest
from another more easily obtained property. Frequently, the regression equations are expressed
in terms of the logarithms of the two properties. Researchers have found that a number of
environmental properties can be related to one another in this manner. For example, QPPRs have
been developed to estimate S, Koc and BCF from Kow, and Koc and BCF from S [1-3].
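
As an illustration of the log-log regression form described above, the following minimal Python sketch fits log10(y) = a + b·log10(x) by ordinary least squares and then uses the fitted line to estimate one property from another. The function names and the idea of passing paired property values as plain lists are assumptions made for this sketch; no coefficients from the cited correlations [1-3] are reproduced here.

```python
import math


def fit_log_log(x_values, y_values):
    """Least-squares fit of log10(y) = a + b * log10(x).

    Returns (a, b). Both inputs must be positive and of equal length.
    """
    log_x = [math.log10(x) for x in x_values]
    log_y = [math.log10(y) for y in y_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_from_property(x, intercept, slope):
    """Estimate y from x using a fitted log-log relationship."""
    return 10 ** (intercept + slope * math.log10(x))
```

In practice the quality of such an estimate depends on how closely the compound of interest resembles the compounds used to fit the line, which is why the regression statistics matter as much as the coefficients.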

Quantitative Structure-Property Relationships (QSPRs) are methods by which the
properties  of  a  chemical  can  be  inferred  or  calculated  from  a  knowledge  of  the  structure  of  a 
molecule.  QSPRs  often  take  the  form  of  a  correlation  between  a  structurally  derived  parameter(s) 
and  the  property  of  interest.  For  example,  relationships  between  structurally  derived  parameters, 
such  as  molecular  connectivity  indices  (MCIs)  and  total  molecular  surface  area  (TSA)  and 
properties  such  as  S,  Kow,  BCF,  and  H  have  been  reported. 

Molecular connectivity, developed by Randic [4] and refined and expanded by Kier and Hall
[5-7], is a method of bond counting from which topological indices, based on the structure of the
compound, can be derived. For a given molecular structure, several types and orders of MCIs can
be calculated. Information on the molecular size, branching, cyclization, unsaturation and
heteroatom content of a molecule is encoded in these various indices [5]. MCIs have been used to
predict Koc [8,9], S [1], Kow [10], H [11] and BCFs [12].
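
To make the bond-counting idea concrete, the sketch below computes the simple first-order path index, the sum over all bonds of (δi·δj)^-1/2, where δ is the number of non-hydrogen neighbors of each atom in the hydrogen-suppressed structure. This is only the simplest member of the family of indices; the bond-list input format and function name are assumptions of this sketch, not a description of how PEP stores structures.

```python
import math


def first_order_chi(bonds):
    """Simple first-order connectivity index for a hydrogen-suppressed graph.

    bonds: list of (i, j) pairs of atom indices, one pair per bond.
    """
    degree = {}
    for i, j in bonds:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    return sum(1.0 / math.sqrt(degree[i] * degree[j]) for i, j in bonds)


# Two C4 isomers: the branched skeleton gives the smaller index.
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (1, 2), (1, 3)]
chi_n_butane = first_order_chi(n_butane)    # about 1.914
chi_isobutane = first_order_chi(isobutane)  # about 1.732
```

The two isomers have the same number of atoms and bonds, so the difference in the index reflects branching alone.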

A  direct  estimation  of  molecular  surface  area  based  on  the  concept  of  van  der  Waals  radii, 
TSA,  has  been  correlated  with  properties  such  as  S,  Kow,  Pv  and  H  [13-22].  Several  different 
algorithms  have  been  developed  to  calculate  TSA  which  require  the  3-D  atomic  coordinates  of  the 
solute  molecule  and  the  van  der  Waals  radii  of  solute  and  solvent  molecules  as  input  [19,23]. 

Group contribution methods are another important category of QSPRs. The basic idea of a
group  contribution  method  is  that  while  there  is  an  enormous  number  of  chemical  compounds, 
both  synthetic  and  naturally  occurring,  the  number  of  functional  groups  that  make  up  these 
compounds  is  much  smaller.  A  single  numerical  value  is  assumed  to  represent  the  contribution  of 
each  functional  group  (i.e.  a  specified  atom,  a  group  of  atoms  bonded  together  or  structural  factor) 
to  the  physical  property  of  interest.  It  is  also  usually  assumed  that  the  contributions  made  by  each 
group  are  independent  of  each  other.  By  summing  up  the  values  of  the  various  fragments  or 
groups  the  property  of  interest  can  be  directly  calculated. 
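
The additive scheme described above can be sketched in a few lines of Python. The group names and contribution values used in the example are hypothetical placeholders chosen only to show the arithmetic; they are not taken from any published group contribution table.

```python
def group_contribution_estimate(group_counts, contributions, constant=0.0):
    """Sum independent group contributions to estimate a property.

    group_counts:  dict mapping group name -> number of occurrences
    contributions: dict mapping group name -> contribution per occurrence
    constant:      optional additive constant of the correlation
    """
    missing = [g for g in group_counts if g not in contributions]
    if missing:
        raise KeyError(f"no contribution available for groups: {missing}")
    return constant + sum(n * contributions[g] for g, n in group_counts.items())


# Hypothetical contribution values, for illustration only.
example_contributions = {"CH3": 0.5, "CH2": 0.25, "OH": -1.0}
example_counts = {"CH3": 1, "CH2": 2, "OH": 1}
example_estimate = group_contribution_estimate(example_counts, example_contributions)
```

Raising an error for a group with no tabulated value, rather than silently treating it as zero, keeps the estimate from being applied outside the range of the underlying table.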

The UNIFAC (UNIQUAC Functional Group Activity Coefficient) group contribution
method [24-26], originally developed to estimate liquid phase activity coefficients in mixtures of
nonelectrolytes, has been used by environmental researchers to estimate S and Kow [27-31]. In
this technique, the activity coefficient is divided into two parts: a combinatorial part, which reflects
the size and shape of the molecules present, and a residual part, which depends on functional
group interactions. Various parameters, such as van der Waals group volumes and surface areas
and group interaction parameters, are input into a series of equations from which the combinatorial
and residual parts are calculated. Values for the group parameters have been tabulated and can be
found in the literature [25,26]. UNIFAC is specifically designed to take into account interactions
between groups and is appropriate for multiple solute/solvent systems. UNIFAC also permits
estimates to be made as a function of temperature.

In  most  cases,  more  than  one  estimation  method  is  available  for  a  particular  property. 
Estimation methods, however, have widely varying accuracies, and indiscriminate use of these
techniques  can  result  in  large  errors.  Selection  and  application  of  QSPR  or  QPPR  methods 
requires  varying  degrees  of  expertise  that  depend  on  the  structure  of  a  particular  chemical  of 
interest,  knowledge  of  the  mechanism  of  the  process,  the  extent  of  the  database  used  to  develop  the 
QSPR  or  QPPR  and  the  complexity  of  the  structural  analysis  required  to  relate  structure  to  the 
property. For example, some QSPRs and QPPRs are broader than others in the range of chemicals
that  are  covered,  and  some  methods  have  been  established  with  a  better  understanding  of  the 
mechanisms  or  properties  involved.  In  many  cases  estimation  methods  are  developed  from 
empirical  or  semi-empirical  correlations.  The  success  of  the  correlation  is  dependent  on  many 
factors  including  the  type  and  number  of  compounds  used  in  its  development. 

Incorporation of QSPRs and QPPRs into a computer format is a logical and necessary step
to gain full advantage of the methodologies for simplifying fate assessment. A practical
computerized property estimation program utilizing QSPRs and QPPRs should have the
following attributes: be simple and flexible to use for both experts and non-experts; include
sufficient statistical information regarding the development of the QSPRs and QPPRs so that the
range of applicability of such models can be evaluated; and provide an indication of the accuracy of
the estimated property.

A microcomputer based Property Estimation Program (PEP), utilizing MCI-property,
TSA-property and property-property correlations and UNIFAC derived activity coefficients, is being
developed to provide both experts and non-experts with a fast, economical method to estimate a
compound's S, Kow, Pv, Koc, BCF, and H for use in environmental fate modeling. The user can
input the required structural information for the MCI and UNIFAC calculation routines using either
Simplified Molecular Input Line Entry System (SMILES) notation or connection tables generated
by ChemDraw™, a commercially available two-dimensional drawing program. The TSA module
accepts 3-D atomic coordinates entered manually or directly reads coordinate files generated by
molecular modeling software such as Alchemy II™ or Chem3D Plus™. For property-property,
TSA-property  and  MCI-property  modules,  the  user  can  select  from  either  "universal"  or  class 
specific  regression  models.  To  aid  the  user  in  choosing  the  most  suitable  regression  model,  the 
program  automatically  suggests  the  most  appropriate  regression  model(s)  based  on  the  structure  of 
the  compound.  In  addition,  the  statistics  associated  with  each  model  can  be  displayed  along  with 
the  list  of  compounds  used  in  developing  the  model.  For  the  regression  based  modules, 
assessments  of  accuracy  based  on  the  95%  confidence  interval  and  estimated  precision  of  the 
experimental  values  are  provided  along  with  the  estimated  property  value.  Additional  correlation 
models  can  be  easily  added  to  PEP  by  the  user. 

A chemical property database (PEP.DB), containing experimental values of S, Kow, H,
Pv, Koc, and BCF compiled from a variety of literature sources and computerized databases, was
used for developing the MCI-property, TSA-property and property-property relationships used in
PEP. This database, which currently contains over 700 chemicals, is linked directly to PEP and
allows the user to search for chemical compounds by full or partial name or
synonym, to sort the compounds by name, boiling point, melting point, or molecular weight, and
to transfer to any of the property estimation modules.

A prototype database containing information regarding the biodegradability of organic
compounds is also being developed for incorporation into the PEP software system. This
database, currently containing information for 33 chemicals, will be used to develop and evaluate
relationships  between  structure  and  biodegradability.  If  successful,  the  resulting  structure- 
biodegradability  relationships  will  be  incorporated  into  PEP  during  the  third  year  of  the  project. 

To illustrate the potential application of PEP, the property estimation modules are linked
directly to the Level 1 Fugacity Model developed by Mackay [32]. This simple model calculates
the equilibrium distribution of an organic chemical between water, air, soil, sediment, suspended
sediment and biota phases in a user defined world. The combination of PEP and the Level 1
Fugacity Model provides the user with a methodology for predicting the environmental distribution
of an organic chemical in a multi-phase system requiring only the structure of the chemical of
interest as input. The current status and use of the PEP system will be described.
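
The equilibrium calculation performed by a Level 1 fugacity model can be sketched as follows. This is a simplified three-compartment illustration (air, water, soil) using the standard fugacity capacity definitions Z_air = 1/(RT), Z_water = 1/H and Z_soil = Koc·f_oc·ρ_soil·Z_water; it is not the six-phase implementation linked to PEP, and the function name, default organic carbon fraction and default soil density are assumptions of this sketch.

```python
R_GAS = 8.314  # gas constant, Pa m3 / (mol K)


def level1_fugacity(total_moles, volumes, henry, koc,
                    temp_k=298.15, f_oc=0.02, soil_density=1.5):
    """Equilibrium (Level 1) distribution among air, water and soil.

    volumes:      dict of compartment volumes in m3 (keys: air, water, soil)
    henry:        Henry's Law constant, Pa m3/mol
    koc:          organic carbon normalized sorption coefficient, L/kg
    f_oc:         soil organic carbon fraction (assumed default)
    soil_density: soil density, kg/L (assumed default)
    Returns (fugacity in Pa, dict of moles per compartment).
    """
    z = {"air": 1.0 / (R_GAS * temp_k), "water": 1.0 / henry}
    z["soil"] = koc * f_oc * soil_density * z["water"]
    # At equilibrium one fugacity applies everywhere: f = M / sum(V_i * Z_i)
    vz = {name: volumes[name] * z[name] for name in volumes}
    fugacity = total_moles / sum(vz.values())
    moles = {name: fugacity * vz[name] for name in vz}
    return fugacity, moles
```

Because H and Koc are exactly the kind of values PEP estimates, a call such as `level1_fugacity(100.0, {"air": 1e4, "water": 1e2, "soil": 1.0}, henry=100.0, koc=1000.0)` shows how estimated properties feed directly into a distribution calculation; the numbers in that call are arbitrary.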

STATUS OF RESEARCH EFFORT


Description of Property Estimation Program (PEP)

HyperCard 

The  PEP  software  system  is  a  HyperCard™  based  program  that  runs  on  Apple  Macintosh 
computers.  HyperCard,  which  is  bundled  with  every  Macintosh  sold,  offers  graphics,  information 
storage,  the  means  to  display  information  in  a  variety  of  formats,  the  ability  to  establish  links 
between  related  information,  a  high  level  language  (HyperTalk),  the  ability  to  extend  HyperTalk  by 
writing  new  commands  in  a  compiled  language  (i.e.  C  or  Fortran)  and  a  mechanism  to  transfer 
control  to  other  Macintosh  applications.  The  PEP  system  uses  all  these  features. 

HyperCard  treats  each  screen  full  of  information  as  a  card  and  each  set  of  related  cards  as  a 
stack.  Cards  can  contain  fields  for  data  and  buttons  for  action  procedures  to  operate  on  the  data  in 
the fields. This allows the standard Macintosh interface to be used without the direct use of the
Macintosh toolbox routines, greatly simplifying programming. In order to create a user interface,
the programmer simply draws or creates the buttons or fields that are to be used. The link
between buttons, fields and cards is made through HyperTalk. HyperTalk is an object oriented,
interpreted  language  which  allows  the  programmer  to  direct  the  flow  of  the  program  and  at  the 
same  time  allows  the  user  the  freedom  to  use  the  program  as  desired.  However,  large  repetitive 
tasks  and  complicated  computations  can  be  very  slow  if  HyperTalk  is  used.  HyperCard  also 
allows  the  programmer  to  link  external  functions  or  commands  which  are  written  in  a  conventional 
programming  language  as  a  means  to  speed  up  the  slow  interpreted  language  and  implement 
custom  features. 

PEP requires the following system configuration to run: a Macintosh Plus, Macintosh SE,
or Macintosh II computer, with a hard disk; HyperCard 2.0 software; Macintosh system software
version 6.0.5 or greater, running under MultiFinder; and a minimum of 2 megabytes of memory
(RAM), with 1000 kBytes of memory allocated for HyperCard.

The  PEP  system  currently  consists  of  five  HyperCard  stacks:  PEP  Processor,  PEP 
Models,  PEP  Help,  Chemical  Property  Database  and  Biodegradation  Database.  While  differing  in 
purpose  and  characteristics,  each  stack  uses  a  consistent  set  of  icons  and  underlying  programming 
technique.  Table  I  lists  some  icons  and  button  types  and  their  uses. 


Table I. Symbols and button types used in PEP

Icon Title        Action
----------------  ----------------------------------------------------------
Return Arrow      Takes you back to the card you were at prior to this one
Help              Shows Help for the current card
Information       Shows general information
Eye               Shows the equations or statistics
Book              Shows the reference
First Card        Takes you to the opening card of HyperCard
Pop Up Button     Lets the user choose from a popup menu list
Action Button     Initiates some action or calculation
Check Box         Select one or more from a list
Radio Button      Select only one from a list

PEP Processor


This  stack,  which  is  divided  up  into  four  sections  or  modules,  contains  the  algorithms  for 
data  input,  calculations  and  output  of  the  estimated  physical-chemical  properties.  The  opening 
screen  of  PEP  is  in  the  form  of  a  flow  chart,  allowing  the  user  to  see  the  different  modules  and  the 
overall  organization  of  the  program.  Similarly,  each  module  described  in  the  following  sections,  is 
also  organized  in  a  flow  chart  form. 

MCI Module. Upon entering the MCI module, illustrated in Figure 1, the user must first
input the necessary structural information using either SMILES notation [33,34] or connection
tables generated from ChemDraw™, a commercially available, Macintosh compatible
two-dimensional (2D) drawing program. SMILES is a chemical notation language specifically
designed for computer use. It is a method of "unfolding" a 2D chemical structure into a single line
of characters containing the structural information.
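
To show what "unfolding" a structure into a line of characters means in practice, the sketch below reads a small subset of SMILES (organic-subset atoms, branches in parentheses, single-digit ring closures and explicit -, =, # bonds) back into an atom list and a bond list. It is an illustrative toy parser written for this example, not the routine used in PEP; charges, bracket atoms and stereochemistry are not handled, and bonds between aromatic (lowercase) atoms are simply recorded with order 1.

```python
def parse_smiles(smiles):
    """Parse a small SMILES subset into (atoms, bonds).

    atoms: list of element symbols (lowercase = aromatic)
    bonds: list of (i, j, order) tuples of atom indices
    """
    atoms, bonds = [], []
    branch_stack = []   # atoms to return to when a branch closes
    ring_open = {}      # ring-closure digit -> atom index that opened it
    prev, order, i = None, 1, 0
    while len(smiles) > i:
        ch = smiles[i]
        two = smiles[i:i + 2]
        step = 1
        if two in ("Cl", "Br"):
            symbol, step = two, 2
        elif ch in "BCNOSPFIcnos":
            symbol = ch
        else:
            symbol = None
        if symbol is not None:
            atoms.append(symbol)
            index = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, index, order))
            prev, order = index, 1
        elif ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        elif ch in "-=#":
            order = {"-": 1, "=": 2, "#": 3}[ch]
        elif ch.isdigit():
            if ch in ring_open:
                bonds.append((ring_open.pop(ch), prev, order))
            else:
                ring_open[ch] = prev
            order = 1
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
        i += step
    return atoms, bonds


# Chlorobenzene: one Cl attached to a six-membered aromatic ring.
atoms, bonds = parse_smiles("Clc1ccccc1")
```

For chlorobenzene this yields seven heavy atoms and seven bonds, the last of which is the ring-closure bond between the first and last aromatic carbons.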

[Figure 1. The MCI method card from PEP. The card is laid out as a flow chart: 1. Input
Structure; 2. Calc. MCIs (with an option to display the MCIs); 3. Choose Prop. (S, Kow, Pv, Koc,
BCF); 4. Choose Regression ("PEP" or "LIT." models, e.g. Universal or Halogenated Aromatics);
5. Estimate Properties. The example compound shown is chlorobenzene.]

After the structural information is entered, MCIs can then be calculated using a set of
HyperCard™ external functions (XFCNs) written in the programming language C, based on code
described by Frazier [35]. The MCI calculation routine in PEP calculates simple, bond and valence
indices of several types (path, cluster, chain, and path/cluster) and orders (0 through 6), if
possible, for each molecule. This results in a maximum of 54 index values for each molecule,
which can be displayed on screen and/or output to a printer. To account for non-dispersive force
effects on aqueous solubility and solubility related properties, zero through sixth order Δ valence
path indices (Δχ), as described by Bahnick and Doucette [36], are calculated by PEP in addition to
the 54 indices described above. To calculate Δχ indices, a nonpolar equivalent is made by
substituting C for O or N atoms. MCIs are calculated for the nonpolar equivalent, and values for
Δχ can be computed for each type of index by:

$$\Delta\chi = (\chi)_{np} - \chi \qquad (1)$$

where (χ)np is the index calculated for the nonpolar equivalent and χ is the index calculated for
the original molecule.
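
As a worked illustration of Equation 1, the sketch below computes a first-order valence path index for ethanol and for its nonpolar equivalent (propane, obtained by substituting C for O) and takes the difference. The valence delta values follow the usual Kier and Hall convention for second-row atoms (number of valence electrons minus number of attached hydrogens); the function name and the list-based input are assumptions of this sketch, and only the first-order index is shown.

```python
import math


def first_order_valence_chi(valence_deltas, bonds):
    """First-order valence path index: sum over bonds of (dv_i * dv_j) ** -0.5."""
    return sum(1.0 / math.sqrt(valence_deltas[i] * valence_deltas[j])
               for i, j in bonds)


# Heavy-atom skeleton shared by ethanol (C-C-O) and propane (C-C-C).
skeleton = [(0, 1), (1, 2)]

# Ethanol: CH3 (dv = 4 - 3 = 1), CH2 (dv = 4 - 2 = 2), OH (dv = 6 - 1 = 5).
chi_ethanol = first_order_valence_chi([1, 2, 5], skeleton)

# Nonpolar equivalent: O replaced by C, giving propane (dv = 1, 2, 1).
chi_nonpolar = first_order_valence_chi([1, 2, 1], skeleton)

# Equation 1: delta chi = (chi)np - chi
delta_chi = chi_nonpolar - chi_ethanol
```

For this example the index is about 1.023 for ethanol and about 1.414 for the nonpolar equivalent, giving a Δχ of about 0.391; a hydrocarbon, which is its own nonpolar equivalent, would give zero.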

After the MCIs are calculated, they can be displayed or printed if desired and the user can then
choose which properties are to be estimated. For each property, two categories of MCI-property
relationships, “PEP” and “LIT”, are displayed. The PEP category provides a list of all
MCI-property relationships, both class specific and “universal”, that were developed in this project
using the experimental values reported in the PEP property database. “Universal” MCI-property
relationships were developed using all available experimental data for a given property regardless
of chemical class. Class specific MCI-property relationships were developed if property values
were available for a sufficient number (10 or greater) of compounds within a particular chemical
class (PCBs, PAHs, ureas, etc.). In addition, several multi-class MCI-property correlations were
developed for broader classes of compounds such as halogenated aliphatics, halogenated
aromatics, etc. To aid the user in choosing the most appropriate model, the suggested chemical
classes based on structure are flagged with a diamond in the popup menu. The chemical class is
determined by looking for a group of atoms, and bonds between those atoms, that distinguishes a
chemical class. The number of chemical classes chosen by the program equals the number
of different distinguishing subgroups that are found. In addition, a summary of the regression
statistics and list of compounds used to develop and evaluate each MCI-property relationship can
be displayed by clicking the “eye” or “view statistics” option found at the left of each regression
model. An example of the statistical information provided for each MCI-property relationship is
shown in Figure 2. Information displayed on the statistics card includes: the MCI-property
regression equation, the list of chemicals used in developing the regression model, the standard
errors of the coefficients in the regression equation, the Analysis of Variance (ANOVA) table, the
r² value, a graph of the predicted vs. experimental values, a graph of the residuals vs. the predicted
values, a graph of the residuals vs. the number of standard deviations, and the appropriate reference.


[Figure 2. Example statistics card from PEP (Class: S Universal). The card shows the regression
results listed here together with three plots: Predicted vs. Exp. (log S), Residual vs. Predicted,
and Residual vs. Prob. (number of standard deviations).]

Regression Results

Variable    Coef.      Std. Error   t
Constant    0.3917     0.1376       2.85
vp1         -0.9257    0.0316       -29.3
Δvp1        1.8251     0.1047       17.4

Source        RSS         MSS       F
Regression    889.176     445       446
Residual      360.920     0.997
Total         1250.096    3.4343

r² = 71.1%    n(obs) = 365    s = 0.9985
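
The summary statistics on the example statistics card in Figure 2 can be cross-checked with a few lines of arithmetic: r² is the regression sum of squares divided by the total sum of squares, and the standard error s is the square root of the residual mean square. The variable names below are chosen for this sketch; the numerical values are those read from the card.

```python
import math

# Values as read from the ANOVA table on the example statistics card.
ss_regression = 889.176
ss_residual = 360.920
ss_total = 1250.096
ms_residual = 0.997

# Fraction of the variance in log S explained by the regression.
r_squared = ss_regression / ss_total

# Standard error of the estimate (in log units).
std_error = math.sqrt(ms_residual)
```

Both results agree with the card (r² of about 71.1% and s of about 0.9985). A standard error near one log unit means that an individual estimate from this universal model could easily differ from the measured value by an order of magnitude.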


The “LIT” MCI-property correlations were compiled from various literature sources.
Information regarding the number and type of compounds included in these models is provided if it
was available in the original reference. Clicking on the "book" icon will display the reference for
the “LIT” regressions.


After choosing the most appropriate regression, estimates for the selected properties can be
made. As shown in Figure 3, the MCI module results window provides an estimate of the
property along with its calculated accuracy, based on both the 95% confidence interval and the
estimated precision of the experimental values.

[Figure 3. Results card from PEP.]


TSA module. The TSA module is similar in operation to the MCI module. The user must
first  input  the  required  structural  information.  However,  unlike  MCIs,  the  calculation  of  TSA 
requires  information  describing  the  geometry  of  the  molecule  in  terms  of  its  3-D  atomic 
coordinates.  The  TSA  module  accepts  3-D  atomic  coordinates  entered  manually  or  directly  reads 
coordinate  files  generated  by  commercially  available,  Macintosh  compatible,  molecular  modeling 
software such as Alchemy II™ or Chem3D Plus™. The TSA module is also designed to accept
files generated by other hardware/software combinations including CONCORD (Tripos
Associates, Inc.), a hybrid expert system and molecular modeling software designed for the rapid
generation of high quality approximate 3-D molecular structures. In addition to the 3-D molecular
structure, the user must also input van der Waals radii for each of the atoms. An editable table of
van der Waals radii, obtained from Bondi [37] for most common atoms, is provided within the TSA
module. Once the molecular geometry and the van der Waals radii are input, TSA can be calculated
using an XFCN which was adapted from the SAVOL2 algorithm developed by Pearlman [19]. In
this  algorithm,  each  atom  of  a  molecule  is  represented  by  a  sphere  centered  at  the  equilibrium 
position  of  the  nucleus.  The  radius  of  the  sphere  is  equal  to  that  of  the  van  der  Waals  radius. 
Planes  of  intersection  between  spheres  are  used  to  estimate  the  contribution  to  surface  area  from  the 
individual atoms or groups. The program computes the surface area of individual atoms or groups
by  numerical  integration,  and  the  overlap  due  to  intersecting  spheres  is  excluded  from  the 
calculation.  TSA  is  calculated  by  the  summation  of  individual  group  contributions.  The  program 
also  allows  the  TSA  of  the  solute  molecule  to  be  calculated  after  the  addition  of  a  suitable  solvent 
radius.  A  more  detailed  description  of  the  TSA  calculation  method  is  provided  by  Pearlman  [19]. 
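
The idea of summing the exposed area of overlapping van der Waals spheres can be illustrated with a simple point-sampling scheme: points are spread over each atom's sphere, points that fall inside any other atom's sphere are discarded, and the surviving fraction is scaled by the sphere's area. This is a deliberately different and cruder technique than the SAVOL2 algorithm described above (it is closer in spirit to the Shrake-Rupley approach), and the function names, input format and default point count are assumptions of this sketch.

```python
import math


def sphere_points(n):
    """Return n roughly uniform points on a unit sphere (golden-spiral lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        phi = golden_angle * k
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def total_surface_area(atoms, solvent_radius=0.0, n_points=2000):
    """Approximate total surface area of overlapping spheres.

    atoms: list of (x, y, z, vdw_radius) tuples in angstroms.
    solvent_radius: optional radius added to every atom before the calculation.
    """
    unit = sphere_points(n_points)
    total = 0.0
    for a, (ax, ay, az, ar) in enumerate(atoms):
        ra = ar + solvent_radius
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = ax + ra * ux, ay + ra * uy, az + ra * uz
            buried = False
            for b, (bx, by, bz, br) in enumerate(atoms):
                if b == a:
                    continue
                rb = br + solvent_radius
                if rb * rb > (px - bx) ** 2 + (py - by) ** 2 + (pz - bz) ** 2:
                    buried = True  # point lies inside a neighboring sphere
                    break
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * ra * ra * exposed / n_points
    return total
```

The accuracy of such a scheme depends on the number of sample points, which is one reason an analytical or carefully integrated method is preferable when surface areas are used as regression inputs.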

After  the  TSA  has  been  calculated,  the  user  then  chooses  the  properties  of  interest  and  a 
regression  equation  for  each  using  the  same  approach  as  described  in  the  MCI  module.  If  the 
SMILES  string  or  the  connection  table  is  also  input,  the  appropriate  chemical  classes  will  be 
flagged  in  the  popup  menu.  The  operation  of  the  TSA  module  from  this  point  on  is  identical  to  that 
of the MCI module.

UNIFAC module  Like the MCI module, the UNIFAC module (illustrated in Figure 4)
requires either a SMILES string or a connection table as input. An XFCN converts the structural
information provided by the SMILES string or connection table into valid UNIFAC subgroups and
counts the number of each subgroup present. In order to break the structure into the proper
subgroups, the SMILES string or the connection table is interpreted and the information is put into
a matrix. Each row and column in the matrix represents an atom in the chemical. Each entry in the
matrix contains the bond order between the two atoms that correspond to that entry. If two
atoms are not connected, a 0 is placed in the corresponding entry. After the
matrix is built, the algorithm "asks" specific questions about each atom, its neighbors, and
how it is connected. If the answers to a set of questions are all true, a subgroup has been found,
the atoms are put together and the matrix is reduced. The questions are then repeated and the next
subgroup is chosen; this continues until no additional subgroups are found. The questions are
asked in a specific sequence so that the resulting subgroups are independent of the order of the
atoms in the matrix.
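
The matrix-and-questions procedure can be sketched as follows. This is a toy illustration with only two rules (ACCl and ACH), not the XFCN used by PEP. The aromatic bond order of 1.5, the function names, and the use of an "unassigned" set in place of reducing the matrix are all assumptions made for the example.

```python
def bond_matrix(n_atoms, bonds):
    """Square matrix of bond orders; 0 means the two atoms are not connected."""
    matrix = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i, j, order in bonds:
        matrix[i][j] = matrix[j][i] = order
    return matrix

def find_subgroups(elements, bonds):
    """Assign atoms of a hydrogen-suppressed structure to subgroups.

    elements: list of element symbols, one per atom.
    bonds: (i, j, order) tuples; aromatic bonds are coded as 1.5 (assumption).
    Returns (subgroup counts, set of atoms left unassigned).
    """
    m = bond_matrix(len(elements), bonds)
    unassigned = set(range(len(elements)))
    counts = {}

    def neighbors(i):
        return [j for j, order in enumerate(m[i]) if order > 0]

    def is_aromatic_carbon(i):
        return elements[i] == "C" and sum(1 for o in m[i] if o == 1.5) == 2

    # Question set 1 (asked first): aromatic carbon bonded to chlorine -> ACCl.
    for i in sorted(unassigned):
        if i in unassigned and is_aromatic_carbon(i):
            chlorines = [j for j in neighbors(i)
                         if elements[j] == "Cl" and j in unassigned]
            if chlorines:
                unassigned -= {i, chlorines[0]}
                counts["ACCl"] = counts.get("ACCl", 0) + 1

    # Question set 2: unsubstituted aromatic carbon (one implicit H) -> ACH.
    for i in sorted(unassigned):
        if is_aromatic_carbon(i) and len(neighbors(i)) == 2:
            unassigned.discard(i)
            counts["ACH"] = counts.get("ACH", 0) + 1

    return counts, unassigned

# Chlorobenzene: atom 0 is Cl, atoms 1-6 form the aromatic ring.
elements = ["Cl", "C", "C", "C", "C", "C", "C"]
bonds = [(0, 1, 1.0), (1, 2, 1.5), (2, 3, 1.5), (3, 4, 1.5),
         (4, 5, 1.5), (5, 6, 1.5), (6, 1, 1.5)]
counts, leftover = find_subgroups(elements, bonds)
# counts -> {'ACCl': 1, 'ACH': 5}
```

Because the more specific question (ACCl) is asked before the more general one (ACH), renumbering the atoms gives the same subgroup counts.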


Figure 4  UNIFAC method card from PEP (example shown: chlorobenzene; UNIFAC groups: 1 ACCl, 5 ACH)


The UNIFAC method for calculating activity coefficients, as described by Grain [38], is
implemented using both HyperTalk and an XFCN. The functional group interaction parameters
presented by Gmehling et al. [26], derived from vapor-liquid equilibria (VLE), are used in the
calculation routine but can be changed by the user. After the activity coefficients are calculated, they
can be displayed along with relevant intermediate values and used to estimate S and Kow by the
following expressions:

$K_{ow} = 0.115\,\gamma_w^{\infty} / \gamma_o^{\infty}$  (2)

$S\ (\mathrm{mol/L}) = 55.6 / \gamma_w^{\infty}$  (3)

where $\gamma_w^{\infty}$ is the activity coefficient of the chemical infinitely dilute in water and $\gamma_o^{\infty}$ is the
activity coefficient of the chemical infinitely dilute in octanol [27].
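
As a minimal sketch, Equations 2 and 3 translate directly into code once the two infinite-dilution activity coefficients are available. The function names are illustrative, and the activity coefficients in the usage lines are hypothetical numbers rather than UNIFAC output for any real compound.

```python
import math

WATER_MOLARITY = 55.6  # mol/L, the constant in Equation 3
KOW_FACTOR = 0.115     # the constant in Equation 2

def solubility_mol_per_l(gamma_w_inf):
    """Equation 3: aqueous solubility from the activity coefficient in water."""
    return WATER_MOLARITY / gamma_w_inf

def kow(gamma_w_inf, gamma_o_inf):
    """Equation 2: octanol/water partition coefficient."""
    return KOW_FACTOR * gamma_w_inf / gamma_o_inf

# Hypothetical activity coefficients, for illustration only.
gamma_w, gamma_o = 1.0e4, 2.0
log_s = math.log10(solubility_mol_per_l(gamma_w))
log_kow = math.log10(kow(gamma_w, gamma_o))
```

Both results are usually reported as base-10 logarithms, which is how the other PEP modules present S and Kow.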

Property/Property module  Input for the Property/Property module, shown in Figure 5,
depends on the properties to be estimated and the regression models used. Thus, the user must
select the properties to be estimated and the regression equations to be used before any input values
are requested. The program keeps track of the inputs required and provides the appropriate input
fields. If available, the required properties can be imported directly from the associated chemical
property database. Information regarding the regression statistics, if available, is also provided as
previously described for the MCI module. After the necessary properties are entered into the
corresponding input fields, the properties of interest can be estimated and the results, along with
the 95% prediction interval (if the necessary data are available), can be viewed.

Figure 5  Property/Property correlation method card (example shown: chlorobenzene). The card leads the user through four steps: 1. Choose Property (S, Kow, Koc, BCF); 2. Choose Regression (PEP or literature equations, e.g. Hansch et al. 1968 from Kow, Karickhoff et al. 1979 from Kow, Universal from S, Universal from Kow); 3. Input Properties (entered manually or looked up in the property database; the example shows log S = -2.360 moles/L and log Kow = 2.840); 4. Estimate Properties.
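
The way the Property/Property module tracks which inputs are needed can be sketched with a small registry of regression models. The data structure and function names are assumptions, not PEP's HyperTalk implementation. The Koc equation uses the coefficients commonly quoted for Karickhoff et al. (1979), which should be checked against the original paper before use, and the BCF equation uses placeholder coefficients invented for the example. Prediction intervals are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass(frozen=True)
class Regression:
    target: str                  # property being estimated, e.g. "log Koc"
    inputs: Tuple[str, ...]      # properties the equation needs
    predict: Callable[..., float]

MODELS: Dict[str, Regression] = {
    # Coefficients as commonly quoted for Karickhoff et al. (1979); verify before use.
    "Koc from Kow": Regression(
        "log Koc", ("log Kow",), lambda log_kow: 1.00 * log_kow - 0.21),
    # Placeholder coefficients, invented for this example only.
    "BCF from Kow (placeholder)": Regression(
        "log BCF", ("log Kow",), lambda log_kow: 0.8 * log_kow - 0.5),
}

def required_inputs(selected: List[str]) -> List[str]:
    """List each input property once, in the order it is first needed."""
    needed: List[str] = []
    for name in selected:
        for prop in MODELS[name].inputs:
            if prop not in needed:
                needed.append(prop)
    return needed

def estimate(selected: List[str], values: Dict[str, float]) -> Dict[str, float]:
    """Apply each selected regression to the supplied input values."""
    results = {}
    for name in selected:
        model = MODELS[name]
        results[model.target] = model.predict(*(values[p] for p in model.inputs))
    return results

chosen = ["Koc from Kow", "BCF from Kow (placeholder)"]
fields = required_inputs(chosen)             # only "log Kow" has to be requested
estimates = estimate(chosen, {"log Kow": 2.840})
```

Selecting the models first and deriving the input fields from them mirrors the order of steps on the card: properties and regressions are chosen before any values are requested.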


PEP  Models 

To illustrate the practical application of PEP, an additional stack called PEP Models was
developed. This stack, which contains the algorithm for the Level 1 Fugacity Model [32], is linked
directly to the PEP Processor, but can also be used independently. The Level 1 Fugacity Model
considers a unit world consisting of six compartments: air, water, soil, suspended solids,
sediment, and biota. The model predicts the equilibrium concentrations of the chemical of interest
in each compartment using the fugacity approach described by Mackay [32]. The model requires
the input of Koc, H and BCF, which can be read directly from the PEP Processor or, if available,
from the PEP chemical property database. In addition to the chemical-specific properties, the density
and volume of each compartment must be specified along with the organic carbon content of the
soil, sediment and suspended solids. An editable set of default values for compartment density,
volume and organic carbon content, as suggested by Mackay, is provided. A complete description
of the model has been given by Mackay [32,39].
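
A minimal sketch of a Level 1 calculation is given below. It works with partition coefficients relative to water rather than explicit fugacity capacities, which is algebraically equivalent at equilibrium. It assumes a dimensionless H (air/water concentration ratio) and Koc and BCF in L/kg. The compartment volumes, densities and organic carbon fractions are illustrative values, not necessarily the program's defaults, and the function name is hypothetical.

```python
def level1_distribution(log_koc, log_h, log_bcf, compartments, total_moles=1.0):
    """Equilibrium distribution of a chemical among compartments.

    compartments: name -> dict with 'kind' ('air', 'water', 'solid', 'biota'),
    'volume' (m^3), 'density' (kg/m^3) and, for solids, 'foc'
    (organic carbon fraction).
    Returns name -> (concentration in mol/m^3, fraction of total amount).
    """
    koc, h, bcf = 10.0 ** log_koc, 10.0 ** log_h, 10.0 ** log_bcf
    ratio = {}  # compartment/water concentration ratio on a volume basis
    for name, c in compartments.items():
        if c["kind"] == "air":
            ratio[name] = h
        elif c["kind"] == "water":
            ratio[name] = 1.0
        elif c["kind"] == "solid":
            # Kd = foc * Koc (L/kg), times density in kg/L
            ratio[name] = c["foc"] * koc * c["density"] / 1000.0
        elif c["kind"] == "biota":
            ratio[name] = bcf * c["density"] / 1000.0
        else:
            raise ValueError("unknown compartment kind: " + c["kind"])
    capacity = sum(ratio[n] * compartments[n]["volume"] for n in compartments)
    c_water = total_moles / capacity
    return {
        n: (ratio[n] * c_water,
            ratio[n] * compartments[n]["volume"] * c_water / total_moles)
        for n in compartments
    }

# Illustrative compartment values (assumed for this example).
UNIT_WORLD = {
    "air": {"kind": "air", "volume": 6.0e9, "density": 1.19},
    "water": {"kind": "water", "volume": 7.0e6, "density": 1000.0},
    "soil": {"kind": "solid", "volume": 4.5e4, "density": 1500.0, "foc": 0.02},
    "suspended solids": {"kind": "solid", "volume": 35.0, "density": 1500.0, "foc": 0.04},
    "sediment": {"kind": "solid", "volume": 2.1e4, "density": 1500.0, "foc": 0.04},
    "biota": {"kind": "biota", "volume": 7.0, "density": 1000.0},
}

# Chlorobenzene inputs as shown on the Fugacity Level 1 card.
result = level1_distribution(2.410, -0.618, 2.45, UNIT_WORLD)
```

With these inputs nearly all of the chlorobenzene is predicted to reside in the air compartment, which reflects its relatively large Henry's law constant and the large air volume assumed.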


[Figure: Fugacity Level 1 input card from PEP Models (example shown: chlorobenzene; log Koc = 2.410, log H (dimensionless) = -0.618, log BCF = 2.45). The card leads the user through three steps: 1. Property Values; 2. Input Environmental Compartment Values (density in kg/m^3, volume in m^3 and % organic carbon for air, water, soil, suspended solids, sediment and biota); 3. Calculate Distribution.]

After the user inputs all the required information, the model calculations are performed in
HyperTalk and the results are presented in both tabular and graphical form, as illustrated in
Figure 7. The results can be viewed graphically as either bar or pie charts, in terms of the
concentration or percent of the chemical in each phase.

[Figure 7: Fugacity Results card from PEP Models (example shown: chlorobenzene)]

PEP Help

Information regarding the operation of each property estimation module and the chemical
property database is available in the PEP Help stack, which is easily accessed at any time within
the PEP system. The organization and layout of each help card are similar to those illustrated in
Figure 8 for the MCI module. The user can select the topic of interest by clicking on the
appropriate radio button, and the information on that subject will be displayed in the scrolling field.

[Figure 8: Help card for the MCI module. Radio-button topics under "MCI Options": overview, input, calculating MCIs, choosing the model, the chemical class, property estimation, limitations. With "overview" selected, the scrolling field reads:]

Molecular connectivity, developed by Randic (1972) and refined and expanded by Kier and Hall
(1976, 1980, 1986), is a method of bond counting from which topological indexes, based on the
structure of the compound, can be derived. For a given molecular structure, several types and
orders of molecular connectivity indexes (MCIs) can be calculated. Information on the molecular
size, branching, cyclization, unsaturation, and heteroatom content of a molecule is encoded in
these various indices (Kier and Hall, 1976). Molecular connectivity has been used to predict Koc
(Sabljic, 1984; Sabljic, 1987; Bahnick and Doucette, 1988), S (Doucette, 1985; Nirmalakhandan
and Speece, 1988a), Kow (Doucette and Andren, 1988), H (Nirmalakhandan and Speece, 1988b)
and BCFs (Sabljic, 1987). One advantage of using MCI-property relationships over
property-property relationships to predict physical-chemical properties is that once the
correlation has been developed only the structure of the chemical of interest is required as input.
[The text is cut off at this point in the figure.]
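
The bond-counting idea in the help text can be made concrete with the simplest index of this family, the first-order connectivity index. It sums $(\delta_i \delta_j)^{-1/2}$ over all bonds of the hydrogen-suppressed skeleton, where $\delta$ is the number of non-hydrogen neighbors of an atom. The sketch below covers only this simple (non-valence) index and is not the XFCN used by PEP; the function name is illustrative.

```python
import math

def first_order_chi(n_atoms, bonds):
    """First-order connectivity index of a hydrogen-suppressed graph.

    bonds: list of (i, j) atom-index pairs, one per bond.
    """
    degree = [0] * n_atoms
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    return sum(1.0 / math.sqrt(degree[i] * degree[j]) for i, j in bonds)

# n-Butane (a chain of four carbons) and isobutane (three carbons on a central one).
chi_butane = first_order_chi(4, [(0, 1), (1, 2), (2, 3)])
chi_isobutane = first_order_chi(4, [(0, 1), (0, 2), (0, 3)])
```

The branched isomer gives the smaller value, which is how the index encodes branching.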

Chemical  Property  Database 

Experimentally determined physical property data for about 700 compounds, each having at
least one value of aqueous solubility (S), octanol/water partition coefficient (Kow), vapor pressure
(Pv), organic carbon normalized soil sorption coefficient (Koc), bioconcentration factor (BCF), or
Henry's law constant (H), were compiled from a variety of literature sources and computerized
databases. Using this information, a chemical property database was constructed using
HyperCard™ and subsequently used for developing MCI-property, TSA-property and
property-property relationships. In addition to the properties listed above, the database includes the
following information: compound name and synonyms, a diagram of the 2-D chemical structure,
CAS number, chemical formula, molecular weight (MW), boiling point (BP), melting point (MP),
and appropriate references for each value. A built-in unit conversion utility enables users to
quickly view property values in a variety of commonly used units. The database is directly
connected to the PEP Processor stack.

The Chemical Property Database also allows the user to search for
chemical compounds by full or partial name or synonym, to sort the compounds by name, boiling
point, melting point, or molecular weight, and to transfer to any of the property
estimation modules. In addition, the user can easily edit existing values or add new information.
The chemical property database screen is illustrated in Figure 9.
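
The search, sort and unit conversion functions described above can be sketched with a small in-memory record list. The record layout and function names are assumptions and do not reflect how the HyperCard stack stores its data. The toluene values (MW 92.13, log S = -2.250 in moles/liter) are those shown on the example database card.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Compound:
    name: str
    mw: float                          # molecular weight, g/mol
    synonyms: List[str] = field(default_factory=list)
    log_s: Optional[float] = None      # log10 of aqueous solubility, mol/L

def search(db, text):
    """Case-insensitive match on a full or partial name or synonym."""
    needle = text.lower()
    return [c for c in db
            if needle in c.name.lower()
            or any(needle in s.lower() for s in c.synonyms)]

def sort_by(db, attribute):
    """Return the compounds sorted by 'name' or 'mw'."""
    return sorted(db, key=lambda c: getattr(c, attribute))

def solubility_mg_per_l(compound):
    """Convert log S (mol/L) to solubility in mg/L."""
    return 10.0 ** compound.log_s * compound.mw * 1000.0

db = [
    Compound("toluene", 92.13, ["methylbenzene"], log_s=-2.250),
    Compound("chlorobenzene", 112.56, ["monochlorobenzene"]),
]
hits = search(db, "benz")
toluene_mg_per_l = solubility_mg_per_l(db[0])
```

Storing the logarithm in one unit and converting on demand keeps a single reference value per property while still allowing the value to be viewed in other units.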


Figure 9  Example card from the Chemical Property Database (example shown: toluene; CAS number 108-88-3; formula C6H5CH3; BP 110.6 °C; MW 92.13). The card lists synonyms and the SMILES string, followed by a table of log values, each with temperature (°C), units and reference, for aqueous solubility (S), octanol/water partition coefficient (Kow), vapor pressure (Pv), Henry's law constant (H), soil/water sorption coefficient (oc-based) (Koc), bioconcentration factor (BCF) and acid dissociation constant (Ka). For toluene the card shows log S = -2.250 (moles/liter) at 25 °C.

Biodegradation Database


A prototype database containing information regarding the biodegradability of organic
compounds is also being developed for incorporation into the PEP software system. This
database, currently containing information for 33 chemicals, will be used to develop and evaluate
relationships between structure and biodegradability. If successful, the resulting
structure-biodegradability relationships will be incorporated into PEP during the third year of the project. In
its current state, the biodegradation database contains the following information: compound name,
structure, SMILES string, molecular weight, aqueous solubility, octanol/water partition
coefficient, reference, matrix (soil, culture), study type (microcosm, field, liquid culture), endpoint
(mineralization, disappearance of parent compound, identification of intermediate degradation
product), chemical concentration, percent loss of chemical over time, degradation rate constant and
order, half-life, organism(s), and environmental variables (temperature, pH, % soil organic matter).

Figure 10  Example card from PEP Biodegradation Database (example shown: 2,4,5-trichlorophenoxyacetic acid; CAS number 93-76-5; formula C8H5Cl3O3). Fields shown on the card:

Full reference: Gibson, S. A. and Suflita, J. M. 1990. Anaerobic biodegradation of 2,4,5-trichlorophenoxyacetic acid in samples from a methanogenic aquifer: stimulation by short-chain organic acids and alcohols. Applied and Environmental Microbiology 56(6): 1825-1832.
Matrix, study type, O2 conditions, endpoint: entries not legible
Chemical conc.: 300-500 uM
% loss/time: 85/12 weeks
Rate constant (k), order, t1/2: no entries
Organism(s): anaerobic sediment from anoxic aquifer
Comments: Autoclaved controls showed no disappearance of 2,4,5-T nor the appearance of degradation products. After twelve weeks of incubation, about 85% of the [remainder illegible]
Environmental variables: temperature (°C), CEC, pH and % OM all entered as "?"
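
The relationship among three of the database fields (percent loss over time, rate constant and half-life) can be illustrated for the simplest case. The sketch assumes first-order kinetics, which the example card does not state (its rate constant and order fields are empty), so the numbers are purely illustrative and are not a result from the cited study. The function names are hypothetical.

```python
import math

def first_order_rate_constant(fraction_lost, elapsed_time):
    """Rate constant k, assuming first-order loss: C/C0 = exp(-k t)."""
    if not 0.0 < fraction_lost < 1.0:
        raise ValueError("fraction_lost must be between 0 and 1")
    return -math.log(1.0 - fraction_lost) / elapsed_time

def half_life(k):
    """Half-life for first-order loss, in the time unit of k."""
    return math.log(2.0) / k

# 85% loss over 12 weeks, as entered in the "% loss/time" field of the card.
k_per_week = first_order_rate_constant(0.85, 12.0)
t_half_weeks = half_life(k_per_week)
```

If a study reports a different reaction order, these two relations do not apply, and the order field of the database record should be used to choose the appropriate expression.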
SUMMARY


A  microcomputer  program  for  estimating  physical/chemical  properties  of  organic  chemicals 
for  use  in  environmental  fate  modeling  has  been  described.  The  Property  Estimation  Program 
(PEP) and associated physical property database (PEP.DB) were developed using HyperCard for
the  Apple  Macintosh  series  of  computers.  The  PEP  system  utilizes  both  quantitative  structure- 
property  and  property-property  relationships  (QSPRs  and  QPPRs)  to  provide  the  user  with  several 
approaches  to  estimate  S,  Kow,  Pv,  H,  Koc  and  BCF  depending  on  the  information  available. 
While  QPPRs  have  been  used  by  both  experts  and  non-experts  for  estimating  properties,  one  of  the 
major  limitations  in  using  QSPRs  has  been  the  difficulty  in  using  the  necessary  software  tools. 
The graphical interface and flow chart design of PEP lead the user through a series of logical steps
designed to provide even non-experts with an economical, easy-to-use software system for property
estimation. The structural information for the MCI and UNIFAC modules can be input using
Simplified  Molecular  Input  Line  Entry  System  (SMILES)  notation  or  connection  tables  generated 
from  a  commercially  available  two-dimensional  drawing  program.  The  TSA  module  accepts  3-D 
cartesian  coordinates  entered  manually  or  directly  reads  coordinate  files  generated  by  molecular 
modeling software. For each property the user can select from either "universal" or class-specific
regression models. The program's built-in intelligence helps the user choose the most appropriate
QSPR based on the structure of the chemical of interest. In addition, sufficient statistical
information is provided to allow the user to judge the validity of the QSPRs and QPPRs
utilized in PEP. Designed to make the program both practical and educational, on-line
documentation is provided not only for the operational characteristics of the program but also for
the  theory  associated  with  the  property  estimation  techniques.  The  combination  of  the  various 
property  estimation  methods,  chemical  property  database,  and  simple  environmental  fate  model 
(Level  1  Fugacity  Model)  illustrates  the  potential  application  of  PEP  in  both  educational  and 
regulatory  settings. 
SUMMARY OF SECOND YEAR ACCOMPLISHMENTS
AND  THIRD  YEAR  OBJECTIVES 

PEP  improvements  and  modifications 

1. Chemical Property Database

a) Database layout and user interface improved. Pop-up buttons for database functions
were replaced with pull-down menus to make PEP more like standard Macintosh
applications.

b) Added information for approximately 100 new compounds.

c) Added export feature that allows user to export all information in database as text.

d) Added report feature that allows user to export all information on a specific card (i.e.,
specific chemical) as text.

e) Added feature that enables user to add and delete references from database.

In  summary,  the  PEP  chemical  property  database  is  fully  functional.  No  additional  features 
or  changes  to  user  interface  are  planned  for  the  final  year  of  the  project.  New  chemical 
property  data  will  be  added  as  it  becomes  available. 

2. PEP Processor

a) MCI module

1) Developed decision support system for choosing most appropriate QSPR based on
chemical class.

2) The MCI calculation algorithm was changed from a C application external to
HyperCard to a HyperCard external function. This change yielded an approximately
tenfold increase in the efficiency of the MCI calculation algorithm. Versions were
compiled for machines with and without a Floating Point Processor (FPU).

3) MCI module modified to accept both hydrogen-suppressed and non-hydrogen-suppressed
connection tables automatically.

4) User interface changed to flow chart format for greater ease of use and consistency
between modules.

In summary, the MCI module is fully functional. In the final year of the project, we will
continue to refine the MCI-property relationships. The relationship between MCIs and
properties such as polarizability, dipole moment, partial atomic charge, and linear
solvation parameters will be investigated.

b)  UNIFAC  module 

1.  Completed  decision  support  system  for  dissecting  SMILES  strings  or  connection 
tables  into  appropriate  UNIFAC  groups  and  retrieving  necessary  group  values  from 
UNIFAC  database. 

2.  User  interface  changed  to  flow  chart  format  for  greater  ease  of  use  and  consistency 
between  modules. 

In  summary,  the  UNIFAC  module  is  fully  functional.  No  changes  to  the  user  interface 
are planned for the final year of the project. During the third year of the project, the
validity  of  the  UNIFAC  estimates  will  be  examined  using  a  test  set  of  compounds 
having  experimentally  measured  physical  property  data.  Property  estimates  from  the 
UNIFAC  model  will  also  be  compared  to  estimates  made  with  the  other  PEP  modules 
and  with  other  literature  methods.  One  additional  property,  the  oil/water  partition 
coefficient,  will  also  be  investigated  and  incorporated  into  the  UNIFAC  module. 

c) Property-property module

1. User interface changed to flow chart format for greater ease of use and consistency
between modules. In the final year of the project, additional property-property
correlations, obtained in this study and from the literature, will be implemented. In
addition, the validity of the property-property relationships will be examined using a
test set of compounds having experimentally measured physical property data.

c)  TSA  module 

1) Converted VAX FORTRAN version of Pearlman's SAVOL2 algorithm [19] for
calculating TSA to a C application on the Macintosh, then converted the C application to a
HyperCard external function.

2)  Investigated  both  universal  and  class  specific  TSA-property  relationships  and 
implemented  a  prototype  version  of  TSA  module  that  uses  TSA-property  relationships 
to  estimate  S,  Kow,  Koc,  Pv,  H,  BCF. 

3)  Investigated  the  relationship  between  TSA  and  partial  atomic  charge  and  S,  Kow, 
Koc, Pv, H, BCF.

During  the  third  year  of  the  project,  the  TSA-property  relationships  will  be  refined  and 
finalized  and  the  validity  of  the  resulting  relationships  will  be  examined  using  a  test  set 
of  compounds  having  experimentally  measured  physical  property  data. 

d)  PEP  Help 

1) Added help module that users can access from any location in the PEP software
system. The PEP Help module provides the user with information regarding the
operation of the various PEP modules. In addition, the PEP Help module
provides information regarding the calculation of MCIs, TSA, and UNIFAC-derived
activity coefficients and the subsequent development of associated QSPRs.

e) PEP models

1) Implemented Level 1 Fugacity Model developed by Mackay [32] in HyperCard and
linked it directly to the property estimation modules.

f)  PEP  biodegradation  database 

1)  A  prototype  database,  containing  information  regarding  the  biodegradability  of 
organic  compounds,  is  being  developed  for  incorporation  into  the  PEP  software 
system.  This  database,  currently  containing  information  for  33  chemicals,  will  be  used 
to  develop  and  evaluate  relationships  between  structure  as  described  by  MCIs  and  TSA 
and  biodegradability.  If  successful,  the  resulting  structure-biodegradability 
relationships will be incorporated into PEP during the third year of the project.